Abstract: A lossy source coding problem with a privacy constraint
is studied in which two correlated discrete sources $X$ and
$Y$ are compressed into a reconstruction $\hat{X}$ with some prescribed
distortion $D$. In addition, a privacy constraint is specified as
the equivocation between the lossy reconstruction $\hat{X}$ and $Y$.
This models the situation where a certain amount of source
information from one user is provided as utility (given by the
fidelity of its reconstruction) to another user or the public, while
some other correlated part of the source information $Y$ must
be kept private. In this work, we show that polar codes are
able, possibly with the aid of time sharing, to achieve any point
in the optimal rate-distortion-equivocation region identified by
Yamamoto, thus providing a constructive scheme that obtains the
optimal tradeoff between utility and privacy in this framework.

I. Introduction 

An important consequence of the ubiquitous growth of 
modern information technology is that an increasing amount of 
private information is shared between different organizations 
and/or users. This entails a tension between privacy and utility 
in the sense that disclosing data provides useful information 
to the receiving entity, while at the same time posing the 
danger of leaking private information. Examples for such a 
tension can be found in many real-life systems, e.g., in social 
networks, smart grids, or databases. 

The tradeoff between utility and privacy has been the
subject of several recent works as surveyed in [1]. A simple
information-theoretic model to analyze this tradeoff is the
lossy source coding problem introduced in [?], where utility
is measured by the reconstruction fidelity and privacy by an
equivocation (i.e., conditional entropy). Reference [?] shows
that (vector) quantization, as realized by means of random
coding, is optimal in the sense that it achieves any point in the
rate-distortion-equivocation region. Several subsequent works
[?]-[?] have addressed related problems in which the introduction
of distortion is used to disguise private information.
For example, [?] focuses on database privacy in which only
certain entries of a database are to be published. Further, the
authors in [?] generalize the result in [?] to the case with side
information at the decoder. While all these works consider
achievability based on random coding, here we focus on the
general setup in [?] and provide a constructive coding scheme
based on polar codes which achieves the optimal rate-utility-privacy
trade-off.

Fig. 1. Illustration of the problem of lossy source coding with privacy
constraints [?], in which the privacy is measured by the leakage $H(Y^n|M)/n$
and the utility by the fidelity as gauged with respect to the expected distortion
$\mathrm{E}(d(X^n,\hat{X}^n))/n$.

Polar codes, as first proposed in [?], are binary block codes
which achieve the capacity of binary-input symmetric memoryless
channels with efficient encoding and decoding algorithms. The
key property of these codes is that they yield virtual channels
which asymptotically converge either to an error-free or to a
completely noisy channel, such that the fraction of asymptotically
error-free channels approaches the symmetric capacity
of the original channel. Polar codes have been generalized to
both asymmetric channels [?], [10] and arbitrary alphabets
[11]. Moreover, polar codes have been shown to achieve the
rate-distortion bound for symmetric binary sources [?] and
asymmetric binary sources under Hamming distortion in [?], [?].

In the following, we show that, for the framework in [?]
under the assumption of prime source alphabets, polar codes
are able, possibly with the aid of time sharing, to achieve any
point in the optimal rate-distortion-equivocation region. To the
best of our knowledge, this is the first constructive scheme that
is provably optimal in terms of the achievable tradeoff between
rate, utility, and privacy.

Notation: An upper case letter $A$ denotes a random variable
and $a$ denotes its realization. We let $A^n$ denote the random
vector $(A_1,\ldots,A_n)$. For any set $\mathcal{S}$, $|\mathcal{S}|$ denotes its cardinality
and $A_{\mathcal{S}}$ denotes the vector $(A_{s_1},\ldots,A_{s_{|\mathcal{S}|}})$ with $s_j \in \mathcal{S}$.

II. System Model and Preliminary Results 

We consider the lossy source coding set-up studied in
[?] and depicted in Fig. 1, in which the encoder wishes to
communicate a source sequence $x^n$ within some distortion to
the decoder, while keeping the receiver's knowledge about a
correlated sequence $y^n$, also available to the encoder, below
some tolerated level. The sources $X^n \in \mathcal{X}^n$ and $Y^n \in \mathcal{Y}^n$
take values in discrete alphabets $\mathcal{X}$ and $\mathcal{Y}$, and are memoryless
with joint distribution $Q_{X^nY^n}(x^n,y^n) = \prod_{i=1}^{n} Q_{XY}(x_i,y_i)$
for some joint pmf $Q_{XY}(x,y)$. Encoding of these pairs leads
to an index $M = m$ with $m \in \{1,2,\ldots,q^{nR}\}$, where
$n$ is the blocklength and $q$ is a prime number. Finally, the
reconstruction of $X^n$ at the decoder is given by a sequence
$\hat{X}^n \in \hat{\mathcal{X}}^n$, with $\hat{\mathcal{X}} = \{0,1,\ldots,q-1\}$. The goal in designing
the system in Fig. 1 is to obtain a desired tradeoff between
the rate $R$, the expected distortion $\mathrm{E}(d(X^n,\hat{X}^n))/n$, and the
information leakage $H(Y^n|M)/n$ about the source $Y^n$ that
can be obtained from observing $M$. For simplicity, in the
following we will identify pmfs by their arguments only and
drop any subscripts.

We now define the operation of both encoder and decoder
and the notion of the rate-distortion-equivocation region. To
this end, we introduce a standard bounded distortion metric
$d : \mathcal{X} \times \hat{\mathcal{X}} \to [0, d_{\max}]$, where $d_{\max} < \infty$ is the maximal
distortion.

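Definition 1 (Code). An $(n, R, D, \Delta)$ code consists of an encoding
function that maps each sequence $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$
to an index $m(x^n, y^n) \in \{1,2,\ldots,q^{nR}\}$ and a decoding
function that maps each index $m$ to an estimate $\hat{x}^n(m) \in \hat{\mathcal{X}}^n$,
such that the average distortion $\frac{1}{n}\mathrm{E}(d(X^n,\hat{X}^n)) = \frac{1}{n}\sum_{i=1}^{n}\mathrm{E}(d(X_i,\hat{X}_i))$
satisfies the inequality

$$\frac{1}{n}\,\mathrm{E}\big(d(X^n,\hat{X}^n)\big) \le D, \tag{1}$$

and the equivocation rate guarantees the inequality¹

$$\frac{1}{n}\,H(Y^n|M) \ge \Delta. \tag{2}$$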

Definition 2 (Rate-distortion-equivocation region). A triple
$(R, D, \Delta)$ is said to be achievable if, for any $\epsilon > 0$ and
$n$ sufficiently large, there exists an $(n, R, D + \epsilon, \Delta - \epsilon)$ code.
The closure of all achievable triples, $\mathcal{R}^*$, is referred to as the
rate-distortion-equivocation region.

Remark 1. The distortion $D$ can be constrained without loss of
generality to lie in the interval $[0, d_{\max}]$, while the equivocation
$\Delta$ may range in the interval $[H(Y|X), H(Y)]$.

A. Preliminaries 

Lemma 1 ([?]). The rate-distortion-equivocation region $\mathcal{R}^*$
is given by the closure of the union of all tuples $(R, D, \Delta)$
such that the inequalities

$$R \ge I(XY;\hat{X}), \tag{3a}$$

$$D \ge \mathrm{E}\big(d(X,\hat{X})\big), \tag{3b}$$

$$\Delta \le H(Y|\hat{X}) \tag{3c}$$

hold for some pmf $P(x, y, \hat{x})$ that satisfies

$$\sum_{\hat{x}} P(x,y,\hat{x}) = Q(x,y), \quad \forall (x,y) \in \mathcal{X} \times \mathcal{Y}. \tag{4}$$

¹ All the entropies will be computed with base-$q$ logarithms and all
summations are done modulo $q$.
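The three quantities in (3a)-(3c) can be evaluated numerically for any candidate pmf $P(x,y,\hat{x})$. The following is a minimal sketch, not part of the original paper: the function names (`entropy`, `region_point`, `dsbs_joint`) and the binary example with crossover probabilities 0.1 and 0.2 are hypothetical choices made for illustration only.

```python
import numpy as np

def entropy(p, q):
    """Entropy of a pmf (any array shape) with base-q logarithms."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(q))

def region_point(P, d, q=2):
    """Evaluate I(XY; Xhat), E[d(X, Xhat)] and H(Y | Xhat) for P[x, y, xhat]."""
    P = np.asarray(P, dtype=float)
    p_xhat = P.sum(axis=(0, 1))                       # marginal of Xhat
    # I(XY; Xhat) = H(XY) + H(Xhat) - H(XY, Xhat)
    rate = entropy(P.sum(axis=2), q) + entropy(p_xhat, q) - entropy(P, q)
    # E[d(X, Xhat)] uses the (x, xhat) marginal and the table d[x, xhat]
    distortion = float((P.sum(axis=1) * np.asarray(d, dtype=float)).sum())
    # H(Y | Xhat) = H(Y, Xhat) - H(Xhat)
    equivocation = entropy(P.sum(axis=0), q) - entropy(p_xhat, q)
    return rate, distortion, equivocation

def dsbs_joint(p, d0):
    """Hypothetical example: Xhat uniform, X = Xhat xor Bern(d0), Y = X xor Bern(p)."""
    P = np.zeros((2, 2, 2))
    for x in range(2):
        for y in range(2):
            for xh in range(2):
                P[x, y, xh] = 0.5 * (d0 if x != xh else 1 - d0) * (p if y != x else 1 - p)
    return P

hamming = np.array([[0.0, 1.0], [1.0, 0.0]])          # d[x, xhat]
rate, distortion, equivocation = region_point(dsbs_joint(0.1, 0.2), hamming)
```

For this assumed example the rate equals one minus the binary entropy of 0.2, the distortion equals 0.2, and the equivocation equals the binary entropy of 0.26, since $Y$ differs from $\hat{X}$ with probability $0.1 \cdot 0.8 + 0.9 \cdot 0.2$.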


Remark 2. From a pmf $P(x,y,\hat{x})$, the test channel

$$W(x,y|\hat{x}) = \frac{P(x,y,\hat{x})}{\sum_{x',y'} P(x',y',\hat{x})} \tag{5}$$

can be calculated. In [?], two specific binary examples are
worked out, namely a source in which the correlation between
the binary variables $X$ and $Y$ is a Z-channel, and a doubly
symmetric binary source, both under Hamming distortion.
From the results in [?], it can be inferred that test channels (5)
that yield boundary points on the rate-distortion-equivocation
region for the former case are generally asymmetric, while for
the latter they can be assumed to be symmetric with no loss
of optimality. We recall that a channel $W(x, y|\hat{x})$ is said to be
symmetric if there exists a permutation $\pi(x, y)$ of the output
alphabet $\mathcal{X} \times \mathcal{Y}$ such that the identity $\pi(x, y) = \pi^{-1}(x, y)$
holds and the equality $W(x, y|1) = W(\pi(x, y)|0)$ is satisfied
for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$.


III. Optimality of Polar Codes 

Let us define as $P(x, y, \hat{x})$ a pmf that achieves an operating
point of interest in the rate-distortion-equivocation region $\mathcal{R}^*$
in Lemma 1. Let us also define as $R^* = I(XY;\hat{X})$, $D^* = \mathrm{E}(d(X,\hat{X}))$,
and $\Delta^* = H(Y|\hat{X})$ the rate, distortion and
equivocation attained under such distribution $P(x,y,\hat{x})$,
respectively. In this section, we demonstrate that polar codes can
achieve any such triple $(R^*, D^*, \Delta^*)$ in $\mathcal{R}^*$. As mentioned,
we focus in the following on the case of a prime size alphabet
$\hat{\mathcal{X}} = \{0,1,\ldots,q-1\}$, although extensions to alphabets of
arbitrary cardinality are possible by following [?].


A. Lossy source coding via polar codes 

We consider a polar coding scheme that is a variant of the
approach proposed in [9] for asymmetric sources, which is in
turn inspired by [10], [13], and extended to prime alphabets by
applying results of [?]. We fix a joint distribution $P(x, y, \hat{x})$
that achieves a desired point $(R^*, D^*, \Delta^*)$ in $\mathcal{R}^*$. To start, let
us define the following joint distribution on the set
$\mathcal{X}^n \times \mathcal{Y}^n \times \hat{\mathcal{X}}^n \times \mathcal{U}^n$, where $\mathcal{U} = \{0,1,\ldots,q-1\}$:

$$P(x^n,y^n,\hat{x}^n,u^n) = \prod_{i=1}^{n} Q(x_i,y_i)\,P(\hat{x}_i|x_i,y_i)\,\mathbb{1}\{u^n = \hat{x}^n G_n\}, \tag{6}$$

where $n = 2^k$ for some integer $k$, $G_n = G^{\otimes k}$ is the polarizing
transform built from the $2 \times 2$ polarization kernel $G$, $\otimes k$ denotes the $k$-times
Kronecker power, and $P(\hat{x}|x,y) = P(x,y,\hat{x})/Q(x,y)$. The
distribution (6) can be interpreted as providing the target joint
distribution over variables $(X^n, Y^n, \hat{X}^n)$ since, under (6), it is
easy to see that the desired distortion $D^*$ and equivocation $\Delta^*$
are attained (see [?]). The challenge is to construct a coding
scheme that mimics (6) without having to transmit a message
$u^n$ of $n$ symbols and hence of rate $R = 1$ from encoder to
decoder. Note that the matrix $G_n$ satisfies $G_n = G_n^{-1}$ and
hence, from $u^n$, one can recover $\hat{x}^n$ as $\hat{x}^n = u^n G_n$ [?].
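The construction of $G_n$ by repeated Kronecker products can be sketched as follows. This is an illustration under an explicit assumption: the kernel is taken to be the standard Arikan kernel $G = [[1, 0], [1, 1]]$, which the text above does not spell out, and the self-inverse check is run only for $q = 2$, where it holds for this assumed kernel. The function names are hypothetical.

```python
import numpy as np

def polar_transform_matrix(k, q=2):
    """G_n = G^{(x)k} modulo q, for the ASSUMED kernel G = [[1, 0], [1, 1]]."""
    G = np.array([[1, 0], [1, 1]], dtype=np.int64)
    Gn = np.array([[1]], dtype=np.int64)
    for _ in range(k):
        Gn = np.kron(Gn, G) % q       # one more Kronecker factor, reduced mod q
    return Gn

def apply_transform(v, Gn, q=2):
    """Row-vector convention used in (6): returns v G_n modulo q."""
    return (np.asarray(v, dtype=np.int64) @ Gn) % q

G8 = polar_transform_matrix(3)                 # n = 2^3 = 8, binary case
xhat = np.array([1, 0, 1, 1, 0, 0, 1, 0])
u = apply_transform(xhat, G8)                  # u^n = xhat^n G_n
xhat_back = apply_transform(u, G8)             # xhat^n = u^n G_n when G_n is self-inverse
```

Applying the same matrix twice returns the original vector in this binary setting, which is the property the recovery step relies on.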

As explained in the following, the encoder maps the sources
$(x^n, y^n)$ into a vector $u^n$, which is divided into two subvectors,
namely the information vector $u_{\mathcal{I}}$, indexed by the set $\mathcal{I}$ of size
$|\mathcal{I}| = nR$ symbols, and the complementary vector $u_{\mathcal{I}^c}$. The
information vector $u_{\mathcal{I}}$ constitutes the message $M$ sent by the
encoder to the decoder. We partition the set $\mathcal{I}^c$ into two sets,
namely, the set $\mathcal{F}$ that identifies the "frozen" symbols $u_{\mathcal{F}}$ and
the set $\mathcal{D}$ that identifies the "computable" symbols $u_{\mathcal{D}}$. These
sets are defined as

$$\mathcal{F} \triangleq \big\{ i \in [1:n] : Z(U_i|U^{i-1},X^n,Y^n) \ge 1 - 2^{-n^{\beta}} \big\} \tag{7}$$

and

$$\mathcal{D} \triangleq \big\{ i \in [1:n] : Z(U_i|U^{i-1}) \le 2^{-n^{\beta}} \big\}, \tag{8}$$

where $\beta < \frac{1}{2}$ is a parameter of the underlying polar coding
scheme. Further, the source Bhattacharyya parameter $Z$ for
two random variables $A \in \{0,1,\ldots,q-1\}$ and $B \in \mathcal{B}$ is
defined as

$$Z(A|B) \triangleq \frac{1}{q-1} \sum_{\substack{a,a' \in \mathcal{A}:\\ a \neq a'}} \; \sum_{b \in \mathcal{B}} \sqrt{P_{A,B}(a,b)\,P_{A,B}(a',b)}. \tag{9}$$

The Bhattacharyya parameters in (7) and (8) are calculated
based on the joint distribution $P(x^n, y^n, \hat{x}^n, u^n)$ given in (6).

From [?, Theorem 1] and [?, Theorem 4.3], the size $nR$ of
the set $\mathcal{I} = \{1, 2,\ldots, n\} \setminus (\mathcal{F} \cup \mathcal{D})$ is such that the rate $R$ is
arbitrarily close to $R^*$ as $n$ grows large.
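The source Bhattacharyya parameter in (9) is straightforward to compute from a joint pmf table. The following is a minimal sketch; the function name and the three test distributions are hypothetical and chosen only because their values of $Z$ are known in closed form.

```python
import numpy as np

def bhattacharyya(P_ab):
    """Z(A|B) from (9) for a joint pmf table P_ab[a, b] with q = number of rows."""
    P_ab = np.asarray(P_ab, dtype=float)
    q = P_ab.shape[0]
    s = np.sqrt(P_ab)                    # s[a, b] = sqrt(P(a, b))
    gram = s @ s.T                       # gram[a, a'] = sum_b sqrt(P(a, b) P(a', b))
    off_diagonal = gram.sum() - np.trace(gram)   # keep only the terms with a != a'
    return float(off_diagonal / (q - 1))

# A uniform and independent of B: B carries no information, so Z = 1.
z_independent = bhattacharyya(np.full((3, 4), 1.0 / 12.0))

# A = B uniform on {0, 1, 2}: B determines A, so Z = 0.
z_deterministic = bhattacharyya(np.eye(3) / 3.0)

# Binary A uniform, B = A xor Bern(0.1): Z = 2 * sqrt(0.1 * 0.9) = 0.6.
z_bsc = bhattacharyya(np.array([[0.45, 0.05], [0.05, 0.45]]))
```

Values of $Z$ close to 1 correspond to indices that qualify for the set in (7), and values close to 0 to indices that qualify for the set in (8).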

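To determine the vector $u^n$, the following randomized
successive encoding rule is used for $i = 1,2,\ldots,n$:

$$u_i = \begin{cases} u \in \mathcal{U} \text{ with probability } P(u|u^{i-1},x^n,y^n) & \text{if } i \in \mathcal{I}, \\ u \in \mathcal{U} \text{ with probability } P(u|u^{i-1}) & \text{if } i \in \mathcal{D}, \end{cases} \tag{10}$$

where the probabilities in (10) are obtained from (6). The
symbols $u_{\mathcal{F}}$ are predetermined and are available at the decoder
prior to encoding. The vector $u_{\mathcal{I}}$ is sent to the decoder, while
the decoder obtains the vector $u_{\mathcal{D}}$ according to a maximum
likelihood rule as in [?], [?]:

$$\hat{u}_i = \begin{cases} u_i & \text{for } i \in \mathcal{I}, \\ \arg\max_{u \in \mathcal{U}} P(u|\hat{u}^{i-1}) & \text{for } i \in \mathcal{D}, \\ u_i & \text{for } i \in \mathcal{F}. \end{cases} \tag{11}$$

Finally, the codeword $\hat{x}^n$ is evaluated as $\hat{x}^n = \hat{u}^n G_n$.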
Remark 3. Note that the decoding rule (11) does not require
encoder and decoder to share the set of Boolean functions
needed by the scheme in [?] (see also [10], [?]), hence
significantly simplifying the implementation.

Remark 4. If $\hat{X}^n$ is i.i.d. uniformly distributed in $\hat{\mathcal{X}}^n$ under
(6), it follows from [?], [?] that the set $\mathcal{D}$ has negligible
size as $n$ grows large and hence the encoding and decoding
rules (10) and (11) can be simplified by setting $\mathcal{D} = \emptyset$
as done in [?]. This condition applies, for instance, to the
doubly symmetric binary source studied in [?] (see Remark
2). Moreover, the encoding rule (10) with $\mathcal{D} = \emptyset$ entails
that the set of codewords $\hat{X}^n$ consists of the (approximately)
$q^{nR}$ sequences of a block coset code defined by the generator
matrix $G_n$ and by the frozen symbols $u_{\mathcal{F}}$.
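For very short blocks, the Bhattacharyya parameters that define the frozen and computable sets can be obtained by exhaustive enumeration of the target distribution. The sketch below is an illustration under several assumptions that are not in the original text: the alphabet is binary, the kernel is the standard Arikan kernel $[[1, 0], [1, 1]]$, and the source is a doubly symmetric binary example with arbitrary crossover probabilities 0.1 and 0.2 and a uniform reconstruction. All function names are hypothetical.

```python
import itertools
import numpy as np

def polar_matrix(k):
    """G_n = G^{(x)k} over GF(2) for the ASSUMED kernel G = [[1, 0], [1, 1]]."""
    G = np.array([[1, 0], [1, 1]])
    Gn = np.array([[1]])
    for _ in range(k):
        Gn = np.kron(Gn, G) % 2
    return Gn

def z_binary(joint):
    """Binary-case Bhattacharyya parameter; joint maps (a, b) -> P(a, b)."""
    side = {b for (_, b) in joint}
    return 2.0 * sum(np.sqrt(joint.get((0, b), 0.0) * joint.get((1, b), 0.0)) for b in side)

def bhattacharyya_profile(k, p, d0):
    """Brute-force Z(U_i | U^{i-1}, X^n, Y^n) and Z(U_i | U^{i-1}) for n = 2^k."""
    n = 2 ** k
    Gn = polar_matrix(k)
    words = list(itertools.product((0, 1), repeat=n))
    with_src = [dict() for _ in range(n)]      # (u_i, (u^{i-1}, x^n, y^n)) -> prob
    without_src = [dict() for _ in range(n)]   # (u_i, u^{i-1}) -> prob
    for xhat in words:
        u = tuple(int(v) for v in (np.array(xhat) @ Gn) % 2)
        for x in words:
            px = np.prod([d0 if a != b else 1 - d0 for a, b in zip(x, xhat)])
            for y in words:
                py = np.prod([p if a != b else 1 - p for a, b in zip(y, x)])
                prob = (0.5 ** n) * px * py    # uniform xhat, then x, then y
                for i in range(n):
                    key = (u[i], (u[:i], x, y))
                    with_src[i][key] = with_src[i].get(key, 0.0) + prob
                    key = (u[i], u[:i])
                    without_src[i][key] = without_src[i].get(key, 0.0) + prob
    return [z_binary(t) for t in with_src], [z_binary(t) for t in without_src]

z_src, z_plain = bhattacharyya_profile(k=2, p=0.1, d0=0.2)
```

Because the reconstruction is uniform in this example, every entry of `z_plain` equals 1, so no index can satisfy the threshold in (8); this is the situation described in Remark 4. The enumeration grows exponentially in $n$ and is only meant for toy sizes.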


B. Optimality of polar codes 

In this section, we establish the optimality of polar codes for
the problem at hand. We start with the following proposition
that entails randomization over the frozen bits. The need for
randomization is removed in Proposition 2.

Proposition 1. Fix a triple $(R^* = I(XY;\hat{X}),\ D^* = \mathrm{E}(d(X,\hat{X})),\ \Delta^* = H(Y|\hat{X}))$
achieved by a joint distribution
$P(x,y,\hat{x})$ in the rate-distortion-equivocation region $\mathcal{R}^*$. For
any $0 < \beta' < \beta < \frac{1}{2}$, any $\epsilon > 0$, and for sufficiently
large $n$, the sequence of rates $R_n = \frac{1}{n}|\mathcal{I}|$, distortions
$D_n = \frac{1}{n}\mathrm{E}(d(X^n,\hat{X}^n))$, and equivocations $\Delta_n = \frac{1}{n}H(Y^n|U_{\mathcal{I}})$ that
satisfy

$$R_n \le R^* + \epsilon, \tag{12a}$$

$$D_n \le D^* + O(2^{-n^{\beta'}}),^{2} \tag{12b}$$

$$\Delta_n \ge \Delta^* - O(2^{-n^{\beta'}}) \tag{12c}$$

is achievable by the polar coding scheme (10)-(11), where
the distortion $D_n$ and the equivocation $\Delta_n$ are averaged over
uniformly distributed frozen symbols $u_{\mathcal{F}}$.

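Proof: We first define the joint distribution induced by the
encoding rule (10) under the assumption that the frozen symbols
are selected as i.i.d. uniform variables with probability $\frac{1}{q}$
according to

$$P^e(x^n, y^n, u^n, \hat{x}^n) = Q(x^n, y^n) \prod_{i \in \mathcal{F}} \frac{1}{q} \prod_{i \in \mathcal{D}} P(u_i|u^{i-1}) \cdot \prod_{i \in \mathcal{I}} P(u_i|u^{i-1},x^n,y^n) \cdot \mathbb{1}\{\hat{x}^n = u^n G_n\}. \tag{13}$$

Note that in (13) the codeword $\hat{X}^n$ is defined based on the
symbols $U^n$ selected by the encoder. We also introduce the
joint distribution that includes both (10) and the decoding rule
in (11) as

$$P^d(x^n, y^n, u^n, \hat{u}^n, \hat{x}^n) = Q(x^n, y^n) \prod_{i \in \mathcal{D}} P(u_i|u^{i-1}) \prod_{i \in \mathcal{I}} P(u_i|u^{i-1},x^n,y^n) \prod_{i \in \mathcal{F}} \frac{1}{q} \prod_{i \in \mathcal{F} \cup \mathcal{I}} \mathbb{1}\{\hat{u}_i = u_i\} \cdot \prod_{i \in \mathcal{D}} \mathbb{1}\{\hat{u}_i = f_i(\hat{u}^{i-1})\} \cdot \mathbb{1}\{\hat{x}^n = \hat{u}^n G_n\}, \tag{14}$$

where $f_i(\hat{u}^{i-1}) = \arg\max_{u \in \mathcal{U}} P(u|\hat{u}^{i-1})$ denotes the decoding function in (11).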

² The notation $f(n) = O(g(n))$ means that there exist constants $n_0$ and $c$
such that for all integers $n > n_0$ the inequality $|f(n)| \le c\,|g(n)|$ holds.

³ In the following, subscripts are used to identify the distribution with
respect to which probabilities, expectations, and information measures are
computed.

The rate condition (12a) follows directly by extension of the
arguments in [13, Theorem 1] to alphabets of prime size and
holds for any choice of the frozen vectors. To prove (12b)
for the ensemble of codes inducing the joint distribution
(14), we need to modify the arguments in [?] in order to
account for possible decoding errors. To this end, we define
the probability of error as $P_e = \Pr_{P^e}[\hat{U}^n \neq U^n]$.³ Denoting
the decoding error event as $E = \{\hat{U}^n \neq U^n\}$, the distortion
$D_n(u_{\mathcal{F}})$ averaged over the frozen vectors $u_{\mathcal{F}}$ satisfies

$$\begin{aligned}
\mathrm{E}_{P^d}[D_n(U_{\mathcal{F}})] &= \frac{1}{n}\Big((1-P_e)\,\mathrm{E}_{P^d}[d(X^n,\hat{X}^n)\,|\,E^c] + P_e\,\mathrm{E}_{P^d}[d(X^n,\hat{X}^n)\,|\,E]\Big) \\
&\le \frac{1}{n}\Big((1-P_e)\,\mathrm{E}_{P^d}[d(X^n,\hat{X}^n)\,|\,E^c] + P_e\,n\,d_{\max}\Big),
\end{aligned} \tag{15}$$


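by the law of total probability and the boundedness of the
distortion metric. Moreover, we have

$$P^d(x^n,y^n,u^n,\hat{u}^n,\hat{x}^n\,|\,E^c) = \frac{P^e(x^n,y^n,u^n,\hat{x}^n)\,\mathbb{1}\{\hat{u}^n = u^n\}}{1-P_e}, \tag{16}$$

and hence the first term in (15) can be computed as

$$\mathrm{E}_{P^d}[d(X^n,\hat{X}^n)\,|\,E^c] = \frac{1}{1-P_e} \sum_{x^n,y^n,u^n,\hat{u}^n,\hat{x}^n} P^e(x^n,y^n,u^n,\hat{x}^n)\,\mathbb{1}\{\hat{u}^n = u^n\}\,d(x^n,\hat{x}^n). \tag{17}$$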

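Furthermore, by [?, Property 2] we have the inequality

$$\frac{1}{n}\,\mathrm{E}_{P^e}[d(X^n,\hat{X}^n)] \le \frac{1}{n}\,\mathrm{E}_{P}[d(X^n,\hat{X}^n)] + d_{\max}\,\big\|P_{X^n,Y^n,\hat{X}^n,U^n} - P^e_{X^n,Y^n,\hat{X}^n,U^n}\big\|, \tag{18}$$

where $\|\cdot\|$ denotes the variational distance of two distributions,
and, by construction, $\frac{1}{n}\mathrm{E}_{P}[d(X^n,\hat{X}^n)] = D^*$ holds true. The
variational distance in (18) can be characterized by following
similar steps as in [9], [?] as (see the Appendix for a sketch)

$$\big\|P_{X^n,Y^n,\hat{X}^n,U^n} - P^e_{X^n,Y^n,\hat{X}^n,U^n}\big\| = O(2^{-n^{\beta'}}) \tag{19}$$

for any $\beta' < \beta$. Finally, we obtain the following bound on the
probability of decoding error

$$P_e \overset{(a)}{\le} \sum_{i \in \mathcal{D}} Z(U_i|U^{i-1}) \overset{(b)}{\le} |\mathcal{D}|\,2^{-n^{\beta}} \overset{(c)}{=} O(n\,2^{-n^{\beta}}), \tag{20}$$

where (a) follows from [?, Proposition 2], (b) is a consequence
of the definition of the set $\mathcal{D}$ in (8), and (c) follows by noting
that the cardinality of the set $\mathcal{D}$ is at most linear in $n$. Using
(15), along with (17)-(20), we have

$$\begin{aligned}
\mathrm{E}_{P^d}[D_n(U_{\mathcal{F}})] &\le \frac{1}{n}\Big(\mathrm{E}_{P^e}[d(X^n,\hat{X}^n)] + P_e\,n\,d_{\max}\Big) \\
&\le \frac{1}{n}\Big[nD^* + n\,d_{\max}\Big(P_e + \big\|P_{X^n,Y^n,\hat{X}^n,U^n} - P^e_{X^n,Y^n,\hat{X}^n,U^n}\big\|\Big)\Big] \\
&= D^* + O(2^{-n^{\beta'}}),
\end{aligned} \tag{21}$$

where the last step uses the fact that $n\,2^{-n^{\beta}} = O(2^{-n^{\beta'}})$ for $\beta' < \beta$,
which allows us to conclude that the distortion inequality (12b)
is satisfied on average over the choice of the frozen vectors.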
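The absorption of the linear factor in (20) into the slightly weaker exponent $\beta'$ only takes effect for sufficiently large $n$. The following sketch compares the two bounds in the log domain to avoid numerical underflow; the function name and the example values $\beta = 0.45$ and $\beta' = 0.4$ are hypothetical.

```python
import math

def log2_bounds(n, beta, beta_prime):
    """Return (log2 of n * 2^{-n^beta}, log2 of 2^{-n^beta_prime})."""
    lhs = math.log2(n) - n ** beta       # log2 of the bound in (20)
    rhs = -(n ** beta_prime)             # log2 of the target order in (12b)
    return lhs, rhs

# For a moderate block length the factor n still dominates ...
lhs_small, rhs_small = log2_bounds(2 ** 10, beta=0.45, beta_prime=0.4)
# ... while for a larger block length the bound with beta' is the weaker one.
lhs_large, rhs_large = log2_bounds(2 ** 20, beta=0.45, beta_prime=0.4)
```

With these assumed exponents the ordering flips between $n = 2^{10}$ and $n = 2^{20}$, which illustrates why the statements hold only for $n$ beyond some threshold.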


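To prove (12c), we first observe that, by construction, we
have $\frac{1}{n}H_{P}(Y^n|\hat{X}^n) = \Delta^*$. The achievable average equivocation
satisfies the equality

$$\mathrm{E}_{P^d}[\Delta_n(U_{\mathcal{F}})] = \frac{1}{n}H_{P^d}(Y^n|U_{\mathcal{I}},U_{\mathcal{F}}) = \frac{1}{n}H_{P^e}(Y^n|U_{\mathcal{I}},U_{\mathcal{F}}) = \mathrm{E}_{P^e}[\Delta_n(U_{\mathcal{F}})], \tag{22}$$

and further we have

$$\mathrm{E}_{P^e}[\Delta_n(U_{\mathcal{F}})] = \frac{1}{n}H_{P^e}(Y^n|U_{\mathcal{I}},U_{\mathcal{F}}) \ge \frac{1}{n}H_{P^e}(Y^n|U^n) = \frac{1}{n}H_{P^e}(Y^n|\hat{X}^n), \tag{23}$$

where the inequality in (23) holds since conditioning reduces
entropy, and the subsequent equality holds due to the one-to-one
correspondence between $\hat{X}^n$ and $U^n$ under $P^e$.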

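Using both the chain rule and the triangle inequality, we
obtain

$$\big|H_{P}(Y^n|\hat{X}^n) - H_{P^e}(Y^n|\hat{X}^n)\big| \le \big|H_{P}(Y^n,\hat{X}^n) - H_{P^e}(Y^n,\hat{X}^n)\big| + \big|H_{P}(\hat{X}^n) - H_{P^e}(\hat{X}^n)\big|. \tag{24}$$

Now, by considering

$$\|P_X - Q_X\| = \sum_{x}\Big|\sum_{y}\big(P(x,y) - Q(x,y)\big)\Big| \le \sum_{x,y}\big|P(x,y) - Q(x,y)\big| = \|P_{X,Y} - Q_{X,Y}\| \tag{25}$$

and by applying [?, Lemma 2.7] with (19) and (25), we
finally obtain the bound

$$\big|H_{P}(Y^n|\hat{X}^n) - H_{P^e}(Y^n|\hat{X}^n)\big| \le O(n\,2^{-n^{\beta'}}).$$

Dividing by $n$ and combining with (22) and (23), this shows that
(12c) is satisfied on average over the choice
of the frozen vectors. ■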
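Inequality (25), which states that marginalization cannot increase the variational distance, can be checked numerically on random pmf pairs. This is a small sanity-check sketch, assuming that the variational distance is the sum of absolute differences as written in (25); the function names and alphabet sizes are hypothetical.

```python
import numpy as np

def l1_distance(p, q):
    """Sum of absolute differences between two pmf tables of equal shape."""
    return float(np.abs(np.asarray(p) - np.asarray(q)).sum())

def marginal_vs_joint(seed=0, shape=(3, 4)):
    """Draw two random joint pmfs on a |X| x |Y| grid and return
    (distance between the X-marginals, distance between the joints)."""
    rng = np.random.default_rng(seed)
    size = shape[0] * shape[1]
    P = rng.dirichlet(np.ones(size)).reshape(shape)
    Q = rng.dirichlet(np.ones(size)).reshape(shape)
    return l1_distance(P.sum(axis=1), Q.sum(axis=1)), l1_distance(P, Q)

# Collect the pairs for a range of seeds; the first entry never exceeds the second.
pairs = [marginal_vs_joint(seed) for seed in range(100)]
```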

We now show that averaging over all frozen vectors is not
required to achieve the triple $(R^*, D^*, \Delta^*)$.

Proposition 2. Any tuple $(R^*, D^*, \Delta^*)$ in (12a), (12b), and
(12c) is achievable by time sharing between at most two
polar coding schemes defined by (10) and (11) with different
sequences of frozen symbols $u_{\mathcal{F}}$.

Proof: We prove this statement by a case distinction. To
elaborate, if a sequence of frozen vectors $u_{\mathcal{F}}$ exists such
that for any fixed $\epsilon > 0$ both conditions (12b) and (12c)
are satisfied, namely $D_n(u_{\mathcal{F}}) \le D^* + \epsilon$ and
$\Delta_n(u_{\mathcal{F}}) \ge \Delta^* - \epsilon$, then the proof is complete. Now, we assume that
none of the vectors $u_{\mathcal{F}}$ satisfies both conditions. By the
discussion above, we can find a sufficiently large $n_0$ such
that $\mathrm{E}_{P^d}[D_n(U_{\mathcal{F}})] \le D^* + \epsilon$ and $\mathrm{E}_{P^d}[\Delta_n(U_{\mathcal{F}})] \ge \Delta^* - \epsilon$
for all $n \ge n_0$. Consider a coordinate system with origin
at $(D^* + \epsilon, \Delta^* - \epsilon)$ in the distortion-equivocation plane (see
Fig. 2). By assumption, for none of the vectors $u_{\mathcal{F}}$ the point
$(D_n(u_{\mathcal{F}}), \Delta_n(u_{\mathcal{F}}))$ is in the second (upper left) quadrant,
while the average $(\mathrm{E}_{P^d}[D_n(U_{\mathcal{F}})], \mathrm{E}_{P^d}[\Delta_n(U_{\mathcal{F}})])$ lies in the
second quadrant. Moreover, the average is in the convex hull
of the points $(D_n(u_{\mathcal{F}}), \Delta_n(u_{\mathcal{F}}))$, which is a polytope. By
simple geometric arguments, one of the edges of this polytope
must cross the second quadrant. Therefore, if the vertices
of this crossing edge are denoted as $(D_n(u_{\mathcal{F},1}), \Delta_n(u_{\mathcal{F},1}))$
and $(D_n(u_{\mathcal{F},2}), \Delta_n(u_{\mathcal{F},2}))$, then we can find $0 \le \alpha \le 1$
such that $D_n^{\alpha} = \alpha D_n(u_{\mathcal{F},1}) + (1-\alpha) D_n(u_{\mathcal{F},2})$ and
$\Delta_n^{\alpha} = \alpha \Delta_n(u_{\mathcal{F},1}) + (1-\alpha) \Delta_n(u_{\mathcal{F},2})$, and $(D_n^{\alpha}, \Delta_n^{\alpha})$ lies in the second
quadrant, hence completing the proof. ■
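The time-sharing step of the proof can be mirrored by a small search over pairs of operating points. The sketch below is illustrative only: the function names, the toy distortion-equivocation points, and the target values are hypothetical, and a practical scheme would obtain the points from actual code evaluations rather than from a hand-written list.

```python
def alpha_interval(a, b):
    """Interval of alpha in [0, 1] with alpha * a <= b, or None if it is empty."""
    if a > 0:
        return (0.0, min(1.0, b / a)) if b >= 0 else None
    if a < 0:
        low = b / a                      # dividing by a < 0 flips the inequality
        return (max(0.0, low), 1.0) if low <= 1 else None
    return (0.0, 1.0) if b >= 0 else None

def time_share(points, d_max, delta_min):
    """Find (i, j, alpha) such that alpha * points[i] + (1 - alpha) * points[j]
    has distortion <= d_max and equivocation >= delta_min, or return None."""
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (d_i, e_i), (d_j, e_j) = points[i], points[j]
            # alpha*d_i + (1-alpha)*d_j <= d_max  <=>  alpha*(d_i - d_j) <= d_max - d_j
            first = alpha_interval(d_i - d_j, d_max - d_j)
            # alpha*e_i + (1-alpha)*e_j >= delta_min  <=>  alpha*(e_j - e_i) <= e_j - delta_min
            second = alpha_interval(e_j - e_i, e_j - delta_min)
            if first and second:
                low, high = max(first[0], second[0]), min(first[1], second[1])
                if low <= high:
                    return i, j, 0.5 * (low + high)
    return None

# Two toy points: the first violates the distortion target, the second the equivocation target.
points = [(0.30, 0.90), (0.10, 0.60)]
result = time_share(points, d_max=0.22, delta_min=0.72)
```

In this toy instance neither point lies in the target quadrant on its own, but an equal-weight mixture of the two does, which is the situation handled by the crossing-edge argument.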



Fig. 2. Convex hull of points in the equivocation-distortion plane. 

Remark 5. For the important case of the doubly symmetric
source and Hamming distortion, time sharing is not necessary.
Hence, there exists a single polar coding scheme defined by
(10) and (11) with a specific choice for the sequence of frozen
bits $u_{\mathcal{F}}$ (and $\mathcal{D} = \emptyset$, see Remark 4) that achieves the desired
point $(R^*, D^*, \Delta^*)$. This can be seen from the fact that for
each vector $u_{\mathcal{F}}$ we have $\mathrm{E}_{P^d}[D_n(U_{\mathcal{F}})] = D_n(u_{\mathcal{F}})$ [?]. Now,
since we know that $\mathrm{E}_{P^d}[\Delta_n(U_{\mathcal{F}})] \ge \Delta^* - \epsilon$, there must be
at least one frozen vector $u_{\mathcal{F}}$ such that $\Delta_n(u_{\mathcal{F}}) \ge \Delta^* - \epsilon$,
which completes the proof.
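Appendix

Proof sketch of (19):

$$\begin{aligned}
&\big\|P_{X^n,Y^n,U^n,\hat{X}^n} - P^e_{X^n,Y^n,U^n,\hat{X}^n}\big\| \\
&\overset{(a)}{=} \sum_{u^n,x^n,y^n} Q(x^n,y^n)\,\Big|\sum_{i=1}^{n}\big(P(u_i|u^{i-1},x^n,y^n) - P^e(u_i|u^{i-1},x^n,y^n)\big)\prod_{j=1}^{i-1}P(u_j|u^{j-1},x^n,y^n)\prod_{j=i+1}^{n}P^e(u_j|u^{j-1},x^n,y^n)\Big| \\
&\overset{(b)}{\le} \sum_{i \in \mathcal{F}}\;\sum_{u^{i-1},x^n,y^n} P(u^{i-1},x^n,y^n)\,\big\|P_{U_i|u^{i-1},x^n,y^n} - P^e_{U_i|u^{i-1},x^n,y^n}\big\| \\
&\overset{(c)}{\le} \sum_{i \in \mathcal{F}}\;\sum_{u^{i-1},x^n,y^n} P(u^{i-1},x^n,y^n)\,\sqrt{(2\ln 2)\,D\big(P_{U_i|u^{i-1},x^n,y^n}\,\big\|\,P^e_{U_i|u^{i-1},x^n,y^n}\big)} \\
&\overset{(d)}{\le} \sum_{i \in \mathcal{F}} \sqrt{(2\ln 2)\,D\big(P_{U_i}\,\big\|\,P^e_{U_i}\,\big|\,U^{i-1},X^n,Y^n\big)} \\
&\overset{(e)}{\le} \sum_{i \in \mathcal{F}} \sqrt{(2\ln 2)\big(1 - Z(U_i|U^{i-1},X^n,Y^n)^2\big)} \\
&\overset{(f)}{\le} n\sqrt{(4\ln 2)\,2^{-n^{\beta}}} = O(2^{-n^{\beta'}}).
\end{aligned}$$

Here, the equalities and inequalities follow from (a) a telescopic
expansion, (b) the fact that the distributions $P$ and $P^e$
are the same for $i \notin \mathcal{F}$, (c) Pinsker's inequality where $D(\cdot\|\cdot)$
is the relative entropy, (d) [?, Lemma 10], (e) [?, Proposition
4.8], (f) (7).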
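Two of the steps in the sketch can be sanity-checked numerically. The code below is an illustration under assumptions that go beyond the text: the variational distance is taken to be the sum of absolute differences, the relative entropy is measured in bits (the binary case $q = 2$), and the function names and sample sizes are hypothetical.

```python
import numpy as np

def kl_bits(p, q):
    """Relative entropy D(p || q) in bits; assumes q > 0 wherever p > 0."""
    mask = p > 0
    return float((p[mask] * np.log2(p[mask] / q[mask])).sum())

def pinsker_gap(trials=1000, size=4, seed=1):
    """Smallest value of sqrt((2 ln 2) D(p || q)) - ||p - q||_1 over random pmf pairs."""
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(trials):
        p = rng.dirichlet(np.ones(size))
        q = rng.dirichlet(np.ones(size))
        bound = np.sqrt(2 * np.log(2) * max(kl_bits(p, q), 0.0))
        gaps.append(bound - np.abs(p - q).sum())
    return float(min(gaps))

def step_f_holds(delta):
    """Check sqrt((2 ln 2)(1 - Z^2)) <= sqrt((4 ln 2) delta) at the boundary Z = 1 - delta."""
    z = 1.0 - delta
    return bool(np.sqrt(2 * np.log(2) * (1 - z * z)) <= np.sqrt(4 * np.log(2) * delta))

smallest_gap = pinsker_gap()            # nonnegative if step (c) holds on all samples
boundary_ok = step_f_holds(1e-3)        # per-index bound used in step (f)
```

The second check uses $1 - (1-\delta)^2 = 2\delta - \delta^2 \le 2\delta$ with $\delta = 2^{-n^{\beta}}$, which is where the factor $4 \ln 2$ in step (f) comes from.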



















